Researchers across related fields are studying how to prevent and control the
spread of coronavirus disease (COVID-19). COVID-19 is spreading
exponentially and infecting people on a massive scale. Preliminary detection
is possible by examining abnormal conditions in the airways that allow the
virus to enter the patient's respiratory tract, which can be visualized with
computed tomography (CT) scans and chest X-ray (CXR) imaging. Deep learning
approaches such as the convolutional neural network (CNN) and the deep
convolutional neural network (DCNN) have been developed to classify COVID-19
CT or CXR images. However, few COVID-19 CXR datasets are openly accessible.
The performance of a deep learning method can be improved by enlarging the
dataset, so the COVID-19 CXR dataset can be augmented with synthetic images.
This study discusses a fast and realistic image synthesis approach, namely the
depthwise boundary equilibrium generative adversarial network (DepthwiseBEGAN).
DepthwiseBEGAN reduced the memory load in the training process by 70.11%
compared to the conventional BEGAN. The DepthwiseBEGAN synthetic images
were assessed with the Fréchet inception distance (FID), yielding a
real-to-real score of 4.3866 and a real-to-fake score of 4.4674.
Moreover, the synthetic images generated by DepthwiseBEGAN improve the
accuracy of conventional CNN models by 22.59%.
1. INTRODUCTION


SARS-CoV-2, the virus that causes coronavirus disease (COVID-19), was first identified in Wuhan, China [1],
and the outbreak was designated by the World Health Organization (WHO) as a global pandemic [2] because it
has reached all corners of the world. At the end of February 2021, there were 116,521,281 confirmed cases, with
116,521,281 active cases and 2,589,548 deaths. Meanwhile, the spread of COVID-19 in Indonesia
also continues to increase: as of February 2021 there were 1,379,662 confirmed cases, 14,518 active cases, and
37,266 deaths [3]. Therefore, steps are needed to prevent, detect, and control the spread of the COVID-19 virus.


Early detection is one way to break the chain of COVID-19 transmission. When a patient is
known to be positive for COVID-19, the patient undergoes a quarantine period, and the chain of spread can be
broken by tracing the people who interacted with the patient. One test used for early
detection is the reverse transcription-polymerase chain reaction (RT-PCR) test, which indicates whether a
patient is positive or negative for COVID-19 infection. As time goes by, the pandemic continues and the
number of reported cases remains high. Unfortunately, RT-PCR testing has limited accuracy (40% to 60%) [4], [5]
in determining whether a patient is infected with the COVID-19 virus [6], [7].

Alternative methods for detecting COVID-19 infection rely on chest screening,
namely computed tomography (CT) scans and chest X-ray (CXR) [8]. CT and X-ray images have
higher sensitivity than RT-PCR testing, so many automated systems have been developed for CT
and X-ray image processing [6]. CT imaging can detect COVID-19 infection. However,
CT testing is costly, age-restricted, and forbidden for pregnant women
because of the radiation. Therefore, the studies in [9] use CXR, which allows easy, fast, and
inexpensive testing. Various deep learning methods are used to automate CXR image
processing in support of COVID-19 detection. The convolutional neural
network (CNN) method obtained an accuracy of 98.50% in positive/negative CXR classification [10], while the
deep neural network (DNN) method obtained an accuracy of 98.08% in the same task
[4]. CXR classification using a deep CNN (DCNN) has an accuracy of 87.3% [11], while classification using
generative adversarial networks (GANs) has an accuracy of 95% [12]. The depthwise separable convolution
(DSC) network has an accuracy of 99.50% [13], and COVIDX-Net has an accuracy of 91% [14]. On the
other hand, public datasets of CT or X-ray images are limited. However, classification performance can be
improved by augmenting the dataset.

Deep learning is an attractive approach for developing automated systems that diagnose
COVID-19 from CXR images. LightCovidNet, which consists of a lightweight CNN (LW-CNN) and
GANs, was trained on a frontal CXR dataset of 446 images (resolution 1024x1024 pixels) with 841,771
parameters and reached an accuracy of 96.97%. Its separable convolution technique
handles the memory load of training 27 times more efficiently than a
conventional CNN, which consists of 23,567,299 parameters [15]. CovidGAN, which combines CNN and
auxiliary classifier GAN (AC-GAN) methods and uses 403 CXR images (14,000,000 parameters), raises
the accuracy obtained with GAN data augmentation from 85% to 95% [12]. Covid-Net, which applies the DCNN
method to 13,975 CXR images (the CovidX dataset; 11,750,000 parameters), yields an accuracy of 93.3%
[16]. Coro-Net, which applies the DNN method to 125 CXR images (33,915,436 parameters), yields an accuracy
of 95% [17]. RANDGAN (randomized GAN), an ANO-GAN-based method that uses 573 CXR images, yields an
accuracy of 71% [5]. GANs combined with ResNet18 and trained on 5,863 CXR images yield an accuracy of 99% [18].

To improve the performance of the classification methods, we propose a new architecture
called DepthwiseBEGAN, which combines depthwise separable convolution (DSC) and BEGAN. Synthetic
images of the COVID-19 CXR dataset are generated using DCGAN, DepthwiseGAN,
BEGAN, and DepthwiseBEGAN to show that DepthwiseBEGAN reduces the training load while the synthetic
images are generated. In this research, we also measure the quality of the generated images using the Fréchet
inception distance (FID). Additionally, this paper presents the improvement of the classification methods when
the generated synthetic images are used as fake CXR datasets. Several classification models are used, such as
ResNet18, ResNet34, ResNet50, and GoogleNet.


2. RESEARCH METHOD 
2.1. Depthwise separable convolution 

CNN is a subclass of DNNs that can solve vision problems. A CNN consists of two primary processes,
namely feature extraction and a fully connected layer. The convolutional layer is the fundamental layer of a
CNN; it determines the characteristics of the image pattern by passing the input matrix through filters.
Let $x^l$ be an input tensor with a triple index consisting of height ($h^l$), width ($w^l$), and
depth ($d^l$). The filter bank $f$ is applied at the spatial location $(h^l, w^l)$, and $d^l$ indexes the receptive
field in $x^l$. The output of the CNN layer can therefore be denoted as [18], [19]:

$$y_{h^{l+1},\,w^{l+1},\,d} = \sum_{w=0}^{W}\sum_{h=0}^{H}\sum_{d=0}^{D} f_{w,h,d} \times x^{l}_{h^{l+1}+h,\;w^{l+1}+w,\;d} \tag{1}$$

where a $D_K \times D_K$ kernel traverses a $D_G \times D_G$ input matrix with $M$ channels to produce $N$ output
channels. The total number of parameters in a kernel is therefore formulated as (2).

$$D_K^2 \times D_G^2 \times M \times N \tag{2}$$


CNN models with high-resolution images require more memory because of the number of
convolutional parameters in the kernel that must be calculated as vectors. Several CNN models
can therefore be simplified by reducing the trainable convolutional parameters. DSC effectively reduces the
number of convolutional parameters and matrix calculations within the limits of precision. A conventional
CNN applies a convolutional kernel across all input channels, so the matrix calculation is carried out
over the input channels for each of the $N$ output channels, with the total parameters shown in (2) [20]. DSC
consists of two convolution processes, namely depthwise convolution and pointwise convolution. Based on (1),
DSC distributes feature learning across the depthwise and pointwise steps [2], formulated as [18], [19], [21]:

$$y_{h^{l+1},\,w^{l+1},\,d} = \sum_{d=0}^{D} f_{d} \times \sum_{w=0}^{W}\sum_{h=0}^{H} f_{w,h} \times x^{l}_{h^{l+1}+h,\;w^{l+1}+w} \tag{3}$$

where $f_d$ is the pointwise filter, a 1x1 convolution layer. Figure 1 represents the DSCN parameters used for
computing the parameter load for each training process.


[Figure 1: diagram of the depthwise filters and the pointwise filters]

Figure 1. DSCN parameters in CXR of COVID-19
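The factorization in (3) can be illustrated with a minimal NumPy sketch. It assumes stride 1, no padding, and one depthwise filter per input channel; the function names `standard_conv` and `depthwise_separable_conv` are illustrative and do not come from the paper. The sketch checks that a depthwise step followed by a pointwise step gives the same output as a standard convolution whose filter bank is the product of the two.

```python
import numpy as np

def standard_conv(x, f):
    """Valid convolution as in (1): x is (H, W, M), f is (k, k, M, N)."""
    k = f.shape[0]
    out_h, out_w = x.shape[0] - k + 1, x.shape[1] - k + 1
    y = np.zeros((out_h, out_w, f.shape[3]))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i:i + k, j:j + k, :]               # (k, k, M) receptive field
            y[i, j, :] = np.tensordot(patch, f, axes=3)  # sum over w, h, and depth
    return y

def depthwise_separable_conv(x, f_dw, f_pw):
    """Depthwise then pointwise: f_dw is (k, k, M), f_pw is (M, N)."""
    k = f_dw.shape[0]
    out_h, out_w = x.shape[0] - k + 1, x.shape[1] - k + 1
    depthwise = np.zeros((out_h, out_w, x.shape[2]))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i:i + k, j:j + k, :]
            # Each channel is filtered on its own; channels are not mixed here.
            depthwise[i, j, :] = np.sum(patch * f_dw, axis=(0, 1))
    # The 1x1 pointwise convolution mixes the M channels into N outputs.
    return depthwise @ f_pw

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 3))       # toy input, not a CXR image
f_dw = rng.normal(size=(3, 3, 3))    # depthwise filters
f_pw = rng.normal(size=(3, 4))       # pointwise filters
f_full = f_dw[:, :, :, None] * f_pw[None, None, :, :]  # equivalent full filter bank

y_std = standard_conv(x, f_full)
y_dsc = depthwise_separable_conv(x, f_dw, f_pw)
```

The equivalence holds only for filter banks that factor in this way; a general standard convolution has more free parameters, which is where the limited precision of DSC comes from.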


The number of parameters in the DSCN is denoted in (4): it is the sum of the parameters in the
depthwise convolution and the parameters in the pointwise convolution, for kernel size $D_K$, feature-map
size $D_G$, $M$ input channels, and $N$ output channels. Compared to (2), the ratio of the number of
parameters in the DSCN to that in the standard CNN is shown in (5) [22]. For example, if a convolution has
$N = 1024$ and $D_K = 3$, the DSCN uses 0.112 of the convolution parameters of the standard CNN in the
training process, a reduction of 0.888, which means that DSC reduces the training load relative to a
conventional CNN.

$$(D_K^2 \times M \times D_G^2) + (D_G^2 \times M \times N) \tag{4}$$

$$\frac{p_{DSCN}}{p_{CNN}} = \frac{(D_K^2 \times M \times D_G^2) + (D_G^2 \times M \times N)}{D_K^2 \times D_G^2 \times M \times N} = \frac{1}{N} + \frac{1}{D_K^2} \tag{5}$$
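The worked example for (5) can be checked numerically. The sketch below is a plain-Python illustration; the function names and the values chosen for $M$ and $D_G$ are assumptions made for the example, and the ratio does not depend on them.

```python
def standard_conv_cost(d_k, d_g, m, n):
    """Cost of a standard convolution as in (2)."""
    return d_k**2 * d_g**2 * m * n

def dsc_cost(d_k, d_g, m, n):
    """Cost of a depthwise separable convolution as in (4)."""
    depthwise = d_k**2 * m * d_g**2
    pointwise = d_g**2 * m * n
    return depthwise + pointwise

def dsc_ratio(d_k, d_g, m, n):
    """Ratio in (5); algebraically equal to 1/N + 1/D_K^2."""
    return dsc_cost(d_k, d_g, m, n) / standard_conv_cost(d_k, d_g, m, n)

# N = 1024 and D_K = 3 come from the text; M = 64 and D_G = 32 are arbitrary.
ratio = dsc_ratio(d_k=3, d_g=32, m=64, n=1024)
print(round(ratio, 3))      # 0.112
print(round(1 - ratio, 3))  # 0.888
```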


2.2. Deep convolution generative adversarial network 

GANs were introduced in 2014 by Goodfellow and consist of two networks,
namely a generator network ($G$) and a discriminator network ($D$). Both models are trained using the mini-max
concept. The generator model $G(z; \theta_g)$ maps noise drawn from the distribution $P_z(z)$ toward the label of
the real data $x$. The discriminator model $D(x; \theta_d)$ is trained so that samples from $P_z$ can be judged
against the data distribution $P_{data}(x)$ [23]. Here, the data distribution $P_{data}(x)$ consists of positive
COVID-19 CXR images [24]. The generator model $G(z; \theta_g)$ minimizes the objective over the fake data
$z \sim P_z$, formulated as (6) [25]:

$$\min_{G} V(G) = \mathbb{E}_{z \sim P_z}\left[\log\left(1 - D(G(z))\right)\right] \tag{6}$$

As shown in (6), the generator network transforms randomized noise from the distribution $P_z(z)$ to fool the
discriminator network, which is trained on the data distribution $x \sim P_{data}(x)$. The discriminator model
$D(x; \theta_d)$ in turn maximizes the probability assigned to the data distribution $P_{data}(x)$, formulated as (7) [25]:

$$\max_{D} V(D) = \mathbb{E}_{x \sim P_{data}(x)}\left[\log D(x)\right] \tag{7}$$

Therefore, the GAN mini-max term based on (6) and (7) can be formulated as (8) [25]:

$$\mathcal{L}_{adv} = \min_{G}\max_{D} V(G, D) = \mathbb{E}_{x}\left[\log D_{src}(x)\right] + \mathbb{E}_{z}\left[\log\left(1 - D_{src}(G(z))\right)\right] \tag{8}$$
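The objectives in (6) to (8) can be estimated from a batch of discriminator outputs. The following is a minimal sketch in plain Python, assuming the discriminator returns probabilities strictly between 0 and 1; the function names and the sample probabilities are hypothetical and only illustrate how the value behaves.

```python
import math

def generator_objective(d_fake):
    """Batch estimate of (6): mean of log(1 - D(G(z)))."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)

def discriminator_objective(d_real):
    """Batch estimate of (7): mean of log D(x)."""
    return sum(math.log(p) for p in d_real) / len(d_real)

def adversarial_value(d_real, d_fake):
    """Batch estimate of V(G, D) in (8)."""
    return discriminator_objective(d_real) + generator_objective(d_fake)

# Hypothetical discriminator outputs on real images.
d_real = [0.90, 0.80, 0.95]
# The discriminator spots the fakes (low probability of being real) ...
value_confident = adversarial_value(d_real, [0.10, 0.20, 0.05])
# ... versus a generator that fools it (high probability of being real).
value_fooled = adversarial_value(d_real, [0.90, 0.80, 0.95])
# When the discriminator cannot tell the difference, D = 0.5 everywhere.
value_balanced = adversarial_value([0.5] * 4, [0.5] * 4)
```

The discriminator pushes the value up and the generator pushes it down; at the balanced point the value equals $-\log 4$.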


where $\mathbb{E}(\cdot)$ denotes the expectation over the generator network and the discriminator network, $V(G, D)$
is the training criterion of the discriminator network given the generator network, and
$D: x \rightarrow \{D_{src}(x), D_{cls}(x)\}$ denotes the discriminator probability distributions over both the source
and its labels. Both the discriminator network $D(\cdot)$ and the generator network $G(\cdot)$ can be optimized using
the objective functions formulated as [23]:

$$\mathcal{L}^{r}_{cls} = \mathbb{E}_{x \sim P_{data}(x)}\left[-\log D_{cls}(x)\right] \tag{9}$$
$$\mathcal{L}^{f}_{cls} = \mathbb{E}_{z \sim P_z}\left[-\log D_{cls}(G(z))\right] \tag{10}$$
$$\mathcal{L}_{rec} = \mathbb{E}_{x \sim P_x}\left[\lVert x - G(G(z)) \rVert_1\right] \tag{11}$$

where $\lambda_{cls}$ denotes the hyperparameter that weights the domain classification losses of the discriminator
($\mathcal{L}^{r}_{cls}$) and the generator ($\mathcal{L}^{f}_{cls}$), and $\lambda_{rec}$ denotes the hyperparameter that
weights the reconstruction loss ($\mathcal{L}_{rec}$), which adopts the $L_1$ norm [23]. $\mathcal{L}_{rec}$ translates
$G(z)$ into $x \sim P_x$, which means that the generator $G(\cdot)$ tries to reconstruct fake labels into real labels.


2.3. Depthwise boundary equilibrium GAN 

Figure 2 shows the DepthwiseBEGAN architecture for an input image of shape (32, 32, 3). DSConv is a
depthwise separable convolution layer, which contains a depthwise layer and a pointwise layer. The
32x32 input is down-sampled to 4x4 and 8x8.
[Figure 2: block diagram. Recoverable labels: DSConv layers with 3x3 kernels and channel widths n, 2n, and 3n, each followed by BatchNorm; SubSampling(2,2) and UpSampling(2,2) stages; a fully connected layer FC(8 x 8 x 3n, 128); a fake/real validation output; and the optimizer.]

Figure 2. Architecture of DepthwiseBEGAN of COVID-19 CXR images


The equilibrium term balances the auto-encoder loss of the real data $(x; \theta_d)$ against that of the generated
data $G(z; \theta_g)$, which are equalized as (12) [26]:

$$\mathbb{E}\left[\mathcal{L}(G(z))\right] = \gamma\,\mathbb{E}\left[\mathcal{L}(x)\right] \tag{12}$$

where $\gamma$ denotes the diversity ratio, and the equilibrium is maintained using proportional control theory
($k_t \in [0, 1]$). Based on (12), the objective function of the boundary equilibrium GAN (BEGAN) is given by
(13)-(15) [27], [28]:

$$\mathcal{L}_D = \mathcal{L}(x) - k_t\,\mathcal{L}(G(z_D)) \tag{13}$$
$$\mathcal{L}_G = \mathcal{L}(G(z_G)) \tag{14}$$
$$k_{t+1} = k_t + \lambda_k\left(\gamma\,\mathcal{L}(x) - \mathcal{L}(G(z_G))\right) \tag{15}$$

where $\lambda_k$ is the proportional gain for $k$,
$\mathcal{L}(x) = \mathbb{E}_{x \sim P_{data}(x)}\left[\lVert D_{cls}(x) - x \rVert_1\right]$ denotes the $L_1$ auto-encoder loss
on the data distribution $x \sim P_{data}(x)$, and
$\mathcal{L}(G(z)) = \mathbb{E}_{z \sim P_z}\left[\lVert D_{cls}(G(z)) - G(z) \rVert_1\right]$ denotes the $L_1$ auto-encoder
loss on the distribution $z \sim P_z$. Essentially, (15) is a form of closed-loop feedback control in which $k_t$
is adjusted at each step $t + 1$. The equilibrium constraint steers the training process toward
$\mathcal{L}(x) > \mathcal{L}(G(z))$. The global convergence measure of the equilibrium is therefore denoted as (16) [27], [28]:

$$\mathcal{M}_{global} = \mathcal{L}(x) + \left|\gamma\,\mathcal{L}(x) - \mathcal{L}(G(z_G))\right| \tag{16}$$
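The bookkeeping in (13) to (16) can be sketched for scalar losses as follows. This is a minimal plain-Python illustration, not the training code of the paper: the function name `began_step` is hypothetical, the loss values in the usage example are made up, and clipping $k$ to $[0, 1]$ is an assumption that follows from the stated range $k_t \in [0, 1]$.

```python
def began_step(loss_real, loss_fake, k_t, gamma, lambda_k):
    """One update of (13)-(16) given scalar auto-encoder losses L(x) and L(G(z))."""
    loss_d = loss_real - k_t * loss_fake            # (13) discriminator loss
    loss_g = loss_fake                              # (14) generator loss
    balance = gamma * loss_real - loss_fake         # equilibrium error from (12)
    k_next = k_t + lambda_k * balance               # (15) proportional control
    k_next = min(max(k_next, 0.0), 1.0)             # keep k inside [0, 1]
    m_global = loss_real + abs(balance)             # (16) convergence measure
    return loss_d, loss_g, k_next, m_global

# Hypothetical losses: the generator is ahead of the equilibrium, so k grows
# and the discriminator will weight the fake-image loss more in the next step.
loss_d, loss_g, k_next, m_global = began_step(
    loss_real=0.10, loss_fake=0.04, k_t=0.0, gamma=0.7, lambda_k=0.001
)
```

A falling `m_global` over training indicates that the auto-encoder loss on real images is shrinking while the equilibrium in (12) is being held.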
2.4. Fréchet inception distance 

The Fréchet inception distance (FID) is used as a metric to assess the image quality of GANs. It
compares the distribution of fake generated images $D_{cls}(G(z_G))$ with the distribution of the real images
$D_{cls}(x)$ that were used to train the generator, approximating both as multivariate Gaussians, as in (17) [29]:

$$FID = \lVert \mu_r - \mu_g \rVert^2 + \mathrm{Tr}\left(\Sigma_r + \Sigma_g - 2\sqrt{\Sigma_r \Sigma_g}\right) \tag{17}$$

where $X_r \sim \mathcal{N}(\mu_r, \Sigma_r)$ and $X_g \sim \mathcal{N}(\mu_g, \Sigma_g)$ denote the Gaussians fitted to the real
and generated images, with $\mu$ the mean and $\Sigma$ the covariance of the 2048-dimensional activations
extracted from a pre-trained Inception-v3 model. Our model transforms the real data distribution
$x \sim P_{data}(x)$ and the fake data distribution $z \sim P_z$ into image dimensions of 32x32, 64x64, 128x128,
and 256x256.
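Equation (17) can be evaluated directly once the activations are available. The sketch below is a minimal NumPy illustration under stated assumptions: it takes activation matrices as input instead of running Inception-v3, it uses random 4-dimensional vectors in place of the 2048-dimensional activations, and the function name `frechet_distance` is hypothetical. The trace of the matrix square root is computed from the eigenvalues of $\Sigma_r \Sigma_g$, which are real and non-negative for covariance matrices.

```python
import numpy as np

def frechet_distance(act_real, act_fake):
    """FID as in (17) from two activation matrices of shape (samples, features)."""
    mu_r, mu_g = act_real.mean(axis=0), act_fake.mean(axis=0)
    sigma_r = np.cov(act_real, rowvar=False)
    sigma_g = np.cov(act_fake, rowvar=False)
    # Tr(sqrt(S_r S_g)) is the sum of the square roots of the eigenvalues of S_r S_g.
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    tr_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_g) - 2.0 * tr_sqrt)

rng = np.random.default_rng(0)
act = rng.normal(size=(200, 4))              # stand-in activations

fid_same = frechet_distance(act, act)        # identical sets: close to zero
fid_shift = frechet_distance(act, act + 3.0) # same covariance, mean shifted by 3
```

Shifting every feature by 3 leaves the covariance unchanged, so the distance reduces to the squared mean difference, 4 x 3^2 = 36.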


3. RESULTS AND DISCUSSION 

The training process was performed on a cloud instance with an Intel(R) Xeon(R) CPU @ 2.30 GHz,
high-memory VMs, 2 vCPUs, 25 GB of RAM, and an NVIDIA P100/T4 GPU with 16 GB of memory on peripheral
component interconnect express (PCI Express). The training process consists of three schemes. The first scheme
trains a conventional DCGAN, BEGAN, and DepthwiseGANs on particular dataset distributions
to generate realistic synthetic CXR images. The second scheme
measures the quality of the augmented images, divided into several batches of random images. The third scheme
tests whether the augmented images can be classified using a particular classification method
such as a CNN.


3.1. Data distributions and hyperparameters 

This paper uses three kinds of datasets: the MNIST dataset, the CelebA dataset, and the COVID-19
CXR dataset. The GAN models were trained on 60K MNIST images, 24K CelebA images, and 5.4K CXR images to
generate realistic synthetic images. Both the generator and discriminator networks were trained
using Adam with an initial learning rate $\alpha = 0.0001$, $\beta_1 = 0.5$, $\beta_2 = 0.999$, and proportional gain
$\lambda_k = 0.7$ [30], with image resolutions varied from 32 to 256. With the hyperparameters in Table 1,
DepthwiseGANs can perform data augmentation by generating synthetic CXR images from randomized noise inputs.

The hyperparameters in Table 1 drive the performance of the GAN types, especially in
image-to-image translation, covering DCGAN, DepthwiseGAN, BEGAN, and DepthwiseBEGAN. Figure 2
shows the generator and discriminator architecture as convolutional feature extraction that is down-sampled
to 4x4 and 8x8. DCGAN has 7.12 million trainable parameters, DepthwiseGAN has 0.76 million trainable
parameters, BEGAN has 8.44 million trainable parameters, and DepthwiseBEGAN has 2.23 million trainable
parameters. Additionally, this research compares the hyperparameter combinations shown in Table 1 to
analyze model performance based on generator loss ($\mathcal{L}_G$), discriminator loss ($\mathcal{L}_D$), and
execution time.
3.2. DepthwiseBEGAN training performance

Based on (11) to (15), the generator loss ($\mathcal{L}_G$) and discriminator loss ($\mathcal{L}_D$) were calculated with
epoch=25, filters=64, image resolution 32x32, random noise=48, and down-sampled size 4x4. Table 2 reports
the training metrics of DCGAN and DepthwiseGAN, namely generator loss ($\mathcal{L}_G$), discriminator loss
($\mathcal{L}_D$), and execution time, on the MNIST, CelebA, and CXR datasets.


Table 1. Hyperparameters 


Description Parameters 
Model DCGAN [30], DepthwiseGAN [23], BEGAN [29], DepthwiseBEGAN 
Dataset MNIST, CelebA [29], CXR 
Down-sampled Size 8x8 [29], 4x4 
Filters/Batch Size 4, 32, 64 
Noises Input 48 
Epoch 25 
Image Resolution 32x32, 64x64, 128x128 [29], 256x256 


Table 2. Loss of DCGAN and DepthwiseGAN using particular datasets

Model          Dataset   L_G      L_D      Exec. time (minutes)
DCGAN          MNIST     5.2513   0.1130   33.1355
DCGAN          CelebA    2.3994   0.7012   37.1142
DCGAN          CXR       1.7703   0.8355   29.7863
DepthwiseGAN   MNIST     3.7615   0.2940   24.6837
DepthwiseGAN   CelebA    2.5109   0.6646   21.3387
DepthwiseGAN   CXR       2.3558   0.6418   17.5


Table 2 shows the generator loss ($\mathcal{L}_G$), discriminator loss ($\mathcal{L}_D$), and execution time of DCGAN
and DepthwiseGAN. The generator and discriminator losses of the two models are comparable, but the
execution time of DepthwiseGAN is lower than that of DCGAN, which follows from
the reduction in the number of trainable parameters. Table 3 shows the generator loss ($\mathcal{L}_G$),
discriminator loss ($\mathcal{L}_D$), and execution time of DCGAN, DepthwiseGAN, BEGAN, and DepthwiseBEGAN with
epoch=25, filters=(4 and 32), image resolution=(64x64, 128x128, and 256x256), and random noise=48.


Table 3. Loss of DCGAN, DepthwiseGAN, BEGAN, and DepthwiseBEGAN using CXR datasets

Model            Filters   Image resolution   L_G      L_D      Exec. time (minutes)
DCGAN            32        64x64              1.4017   0.6744   49.7863
DepthwiseGAN     32        64x64              1.5538   0.5188   20.2133
BEGAN            32        64x64              0.0451   0.0885   76.5556
DepthwiseBEGAN   32        64x64              0.0465   0.0811   22.8859
BEGAN            32        128x128            0.0635   0.0789   132.3350
DepthwiseBEGAN   32        128x128            0.0643   0.0799   48.7891
BEGAN            4         256x256            0.0797   0.0989   186.4425
DepthwiseBEGAN   4         256x256            0.0785   0.0965   117.4362


Based on Table 3, the execution time of DepthwiseBEGAN was shorter than that of BEGAN for
every combination of filters and image resolution tested. Meanwhile, the generator loss ($\mathcal{L}_G$) and
discriminator loss ($\mathcal{L}_D$) of the two models remained close in the training stage. DepthwiseBEGAN is able to
generate synthetic images of 256x256 pixels. However,
the number of filters was reduced because of GPU limitations.
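The size of the execution-time gap can be worked out from the BEGAN and DepthwiseBEGAN rows of Table 3. The short sketch below is only arithmetic on the reported times; the helper names are illustrative.

```python
# Execution times in minutes from Table 3: (BEGAN, DepthwiseBEGAN).
times = {
    "64x64, 32 filters": (76.5556, 22.8859),
    "128x128, 32 filters": (132.3350, 48.7891),
    "256x256, 4 filters": (186.4425, 117.4362),
}

def speedup(baseline, proposed):
    """How many times faster the proposed model trains."""
    return baseline / proposed

def time_saved_percent(baseline, proposed):
    """Relative reduction in execution time, in percent."""
    return 100.0 * (baseline - proposed) / baseline

summary = {
    config: (speedup(began, dw), time_saved_percent(began, dw))
    for config, (began, dw) in times.items()
}

for config, (factor, saved) in summary.items():
    print(f"{config}: {factor:.2f}x faster, {saved:.1f}% less time")
```

For the 64x64 configuration the reduction is roughly 70.1%, which appears to coincide with the 70.11% figure reported in the abstract; the gain shrinks at 256x256, where only 4 filters were used.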

The training behavior of DepthwiseBEGAN is shown in Figure 3. Figure 3(a) shows the generator loss
($\mathcal{L}_G$) and discriminator loss ($\mathcal{L}_D$), Figure 3(b) shows the proportional control variable ($k_{t+1}$)
and the global convergence measure ($\mathcal{M}_{global}$), and Figure 3(c) shows the domain classification loss of
the discriminator ($\mathcal{L}^{r}_{cls}$), the domain classification loss of the generator ($\mathcal{L}^{f}_{cls}$), and
the reconstruction loss ($\mathcal{L}_{rec}$).


3.3. DepthwiseBEGAN performance measurement 

Following the architecture in Figure 2, DepthwiseBEGAN was run with filter size=4, image resolution=256x256,
epoch=25, and random noise=48. The image quality of GANs can be assessed by measuring FID, which
compares the distributions of real-to-real (RR) images, fake-to-real (FR) images, and
fake-to-fake (FF) images. The FID measurements, captured at every 12K iteration
steps of the DepthwiseBEGAN training process, are reported in Table 4.
Figure 3. DepthwiseBEGAN (a) training loss $\mathcal{L}_G$ and $\mathcal{L}_D$, (b) $\mathcal{M}_{global}$ and $k_{t+1}$, and (c) domain loss $\mathcal{L}^{r}_{cls}$, $\mathcal{L}_{rec}$, and $\mathcal{L}^{f}_{cls}$


Table 4. FID score of BEGAN and DepthwiseBEGAN

Model            Batch   FID_RR     FID_FF    FID_FR
BEGAN            1       4.3866     4.6633    4.4098
DepthwiseBEGAN   1       4.3866     4.6633    4.4674
BEGAN            5       17.2621    12.9109   20.9808
DepthwiseBEGAN   5       17.2621    12.9109   25.9938
BEGAN            20      59.2829    39.3068   69.5281
DepthwiseBEGAN   20      59.2829    39.3068   77.2037
BEGAN            50      104.9618   68.2335   146.6602
DepthwiseBEGAN   50      104.9618   68.2335   157.7203


Table 4 reports FID for batches containing 1, 5, 20, and 50 images. GANs are evaluated by
propagating the RR, FR, or FF image distributions through a pre-trained Inception-v3. The FID value
is zero when an image is compared with itself. For a random image with a batch size of 1, the FID value of
RR is 4.3866, the FID value of FR is 4.4098, and the FID value of FF is 4.6633. The synthetic images
augmented by DepthwiseBEGAN are shown in Figure 4: Figure 4(a) shows synthetic
CXR images with the normal label, and Figure 4(b) shows synthetic
CXR images with the bacteria/virus label.

The synthetic images generated as fake images $D_{cls}(G(z))$ augment the CXR dataset as follows.
The normal label has 12.49K train images, 3.75K validation images, and 7.13K test images. The virus
label has 12.98K train images, 3.49K validation images, and 6.49K test images. The bacteria label has
20.01K train images, 4.99K validation images, and 7.49K test images.

The augmented CXR images were used to train several CNN models, such as ResNet18 [27], ResNet-50
[16], and VGG19 [28]. The real and fake generated data distributions, with an input resolution of 128x128, were
trained using the Adam optimizer with an initial learning rate $\alpha = 0.0001$, $\beta_1 = 0.5$, $\beta_2 = 0.999$ [30]
for 50 iterations. To represent the performance of the CNN models, this paper reports the accuracy, sensitivity,
specificity, positive predictive value (PPV), and negative predictive value (NPV), which can be formalized as [27]:
Accuracy = (TP + TN)/(TP + TN + FP + FN) (18)
Sensitivity = (TP)/(TP + FN) (19) 
Specificity = (TN)/(TN + FP) (20) 
PPV = (TP)/(TP + FP) (21) 
NPV = (TN)/(TN + FN). (22) 


Based on (19) to (22), sensitivity and specificity are defined for a binary classification domain.
Taking the 'virus' label as the positive class, sensitivity is the fraction of actual virus cases that are
detected, that is, TP divided by the sum of TP and FN. Specificity is the fraction of non-virus cases that are
correctly rejected, that is, TN divided by the sum of TN and FP. Positive predictive value
(PPV) is the fraction of 'virus' predictions that are correct, that is, TP divided by the sum of TP and FP.
Negative predictive value (NPV) is the fraction of non-virus predictions that are correct, that is,
TN divided by the sum of TN and FN. Figure 5 shows the training accuracy of the CNN models used to classify
the CXR images into the normal, bacteria, and virus labels. Figure 5(a) represents the CNN training
accuracy using the real CXR datasets, and Figure 5(b) represents the CNN training accuracy using the generated
(fake) CXR datasets.
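The definitions in (18) to (22) translate directly into a few lines of code. The sketch below is a plain-Python illustration; the function name is hypothetical, and the counts in the usage example are not taken from the paper. They were chosen so that the resulting percentages coincide with the virus row for GoogleNet on generated data in Table 5, since the paper does not list the underlying counts.

```python
def classification_metrics(tp, tn, fp, fn):
    """Metrics (18)-(22) from the four confusion-matrix counts of one label."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),   # (18)
        "sensitivity": tp / (tp + fn),                 # (19)
        "specificity": tn / (tn + fp),                 # (20)
        "ppv": tp / (tp + fp),                         # (21)
        "npv": tn / (tn + fn),                         # (22)
    }

# Hypothetical one-vs-rest counts for the 'virus' label.
metrics = classification_metrics(tp=15, tn=32, fp=2, fn=1)
percent = {name: round(100.0 * value, 2) for name, value in metrics.items()}
print(percent)
```

For a three-label problem such as normal, bacteria, and virus, the same function is applied once per label by treating that label as positive and the other two as negative. A denominator of zero (for example, no positive predictions) would need separate handling, which this sketch omits.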

Figure 5 presents the CNN training accuracy using the real and fake CXR datasets. The confusion matrix
metrics were calculated based on (18) to (22). For each CNN model, the confusion matrix was calculated
using 100 images from the respective sources, as shown in Table 5.


Figure 4. Synthetic images augmented by DepthwiseBEGAN (a) normal label and (b) bacteria/virus label 
Figure 5. CNN training accuracy using (a) real and (b) fake CXR images

Table 5. CNN confusion matrix

Data dist.                     Model       Label      Sensitivity (%)   Specificity (%)   PPV (%)   NPV (%)
Real, D_cls(x)                 GoogleNet   Normal     64.29             75.00             50.00     84.38
Real, D_cls(x)                 GoogleNet   Virus      43.75             64.71             36.84     70.97
Real, D_cls(x)                 GoogleNet   Bacteria   40.00             83.33             64.54     67.57
Fake generated, D_cls(G(z))    GoogleNet   Normal     100.00            97.22             93.33     100.00
Fake generated, D_cls(G(z))    GoogleNet   Virus      93.75             94.12             88.24     96.97
Fake generated, D_cls(G(z))    GoogleNet   Bacteria   90.00             100.00            100.00    93.75
Real, D_cls(x)                 ResNet-18   Normal     78.57             88.88             73.33     81.43
Real, D_cls(x)                 ResNet-18   Virus      68.75             76.47             57.89     83.87
Real, D_cls(x)                 ResNet-18   Bacteria   70.00             93.33             87.50     82.35
Fake generated, D_cls(G(z))    ResNet-18   Normal     100.00            97.22             93.33     100.00
Fake generated, D_cls(G(z))    ResNet-18   Virus      100.00            97.06             94.12     100.00
Fake generated, D_cls(G(z))    ResNet-18   Bacteria   90.00             100.00            100.00    93.75
Real, D_cls(x)                 ResNet-34   Normal     80.00             88.57             75.00     91.18
Real, D_cls(x)                 ResNet-34   Virus      72.22             78.13             65.00     83.33
Real, D_cls(x)                 ResNet-34   Bacteria   70.59             93.93             85.72     86.11
Fake generated, D_cls(G(z))    ResNet-34   Normal     100.00            97.22             93.33     100.00
Fake generated, D_cls(G(z))    ResNet-34   Virus      100.00            96.97             94.44     100.00
Fake generated, D_cls(G(z))    ResNet-34   Bacteria   89.47             100.00            100.00    93.93
Real, D_cls(x)                 ResNet-50   Normal     61.54             89.19             66.67     86.84
Real, D_cls(x)                 ResNet-50   Virus      62.50             76.48             55.55     81.25
Real, D_cls(x)                 ResNet-50   Bacteria   76.19             86.21             80.00     83.33
Fake generated, D_cls(G(z))    ResNet-50   Normal     100.00            91.89             81.25     100.00
Fake generated, D_cls(G(z))    ResNet-50   Virus      93.75             97.06             93.75     97.06
Fake generated, D_cls(G(z))    ResNet-50   Bacteria   85.71             100.00            100.00    90.63


4. CONCLUSION 

Chest screening with X-ray technology is one of the most common procedures for detecting COVID-19;
CXR imaging accurately identifies whether a patient is infected with the COVID-19 virus or not.
Computational approaches such as CNN can classify CXR images into three labels: the normal
label, the bacteria label, and the COVID-19 label. Covid-Net was trained on the largest number of CXR images
(14K images) and classifies CXR images with 93% accuracy. Several other methods were trained on small CXR
datasets of fewer than 10K images. An image synthesis method is proposed to augment COVID-19 CXR images
with the goal of increasing the accuracy of classification methods.

At a resolution of 64x64, DCGAN trained for CXR image synthesis reached a generator loss
of 1.4017, a discriminator loss of 0.6744, and an execution time of 49.7863 minutes.
DepthwiseGAN trained for CXR image synthesis reached a generator loss of 1.5538, a
discriminator loss of 0.5188, and an execution time of 20.2133 minutes. DepthwiseGANs thus
shorten the execution time while producing images that the discriminator judges to be better. The quality of
DepthwiseGANs was improved by using the encoder-decoder model of GANs, namely BEGAN. BEGAN trained for
CXR image synthesis reached a generator loss of 0.0451, a discriminator loss of 0.0885, and an execution
time of 76.5556 minutes. DepthwiseBEGAN trained for CXR image synthesis reached a generator
loss of 0.0465, a discriminator loss of 0.0811, and an execution time of 22.8859 minutes.

The FID of DepthwiseBEGAN was measured by varying the number of images per batch and the source of
the images. Using one random image, the FID was 4.6633 for fake-to-fake (FF),
4.3866 for real-to-real (RR), and 4.4674 for fake-to-real (FR). Furthermore, the synthetic images
generated by DepthwiseBEGAN improve the accuracy of conventional CNN models by 22.59%.